{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Project 5 - Goal 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the previous goal, we create a class that was both a context manager and an iterator.\n",
    "\n",
    "Here we want to create our context manager using a generator function instead.\n",
    "\n",
    "Let's first recall what our context manager looked like in Goal 1:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import csv\n",
    "from collections import namedtuple\n",
    "\n",
    "def get_dialect(f_name):\n",
    "    with open(f_name) as f:\n",
    "        return csv.Sniffer().sniff(f.read(1000))\n",
    "\n",
    "class FileParser:\n",
    "    def __init__(self, f_name):\n",
    "        self.f_name = f_name\n",
    "        \n",
    "    def __enter__(self):\n",
    "        self._f = open(self.f_name, 'r')\n",
    "        self._reader = csv.reader(self._f, get_dialect(self.f_name))\n",
    "        headers = map(lambda x: x.lower(), next(self._reader))\n",
    "        self._nt = namedtuple('Data', headers)\n",
    "        return self\n",
    "        \n",
    "    def __exit__(self, exc_type, exc_value, exc_tb):\n",
    "        self._f.close()\n",
    "        return False\n",
    "    \n",
    "    def __iter__(self):\n",
    "        return self\n",
    "    \n",
    "    def __next__(self):\n",
    "        if self._f.closed:\n",
    "            # file has been closed - so we're can't iterate anymore!\n",
    "            raise StopIteration\n",
    "        else:\n",
    "            return self._nt(*next(self._reader))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, we want to reproduce this using a generator function. The problem is that we implemented two protocols in here, and the context manager decorator relies on a specific structure to the generator function.\n",
    "\n",
    "Obviously we'll have to separate those two protocols out in order to make this work.\n",
    "\n",
    "First we'll create a generator function to iterate through the data - it will need to have the data iterator from `csv.reader` as well as the named tuple class we want to use.\n",
    "\n",
    "Let's write that first:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def parsed_data_iter(data_iter, nt):\n",
    "    for row in data_iter:\n",
    "        yield nt(*row)   "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we have to create our context manager using a generator function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from contextlib import contextmanager"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import csv\n",
    "from collections import namedtuple\n",
    "    \n",
    "@contextmanager\n",
    "def parsed_data(f_name):\n",
    "    f = open(f_name, 'r')\n",
    "    try:\n",
    "        reader = csv.reader(f, get_dialect(f_name))\n",
    "        headers = map(lambda x: x.lower(), next(reader))\n",
    "        nt = namedtuple('Data', headers)\n",
    "        yield parsed_data_iter(reader, nt)\n",
    "    finally:\n",
    "        f.close()    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Data(ssn='100-53-9824', first_name='Sebastiano', last_name='Tester', gender='Male', language='Icelandic')\n",
      "Data(ssn='101-71-4702', first_name='Cayla', last_name='MacDonagh', gender='Female', language='Lao')\n",
      "Data(ssn='101-84-0356', first_name='Nomi', last_name='Lipprose', gender='Female', language='Yiddish')\n",
      "Data(ssn='104-22-0928', first_name='Justinian', last_name='Kunzelmann', gender='Male', language='Dhivehi')\n",
      "Data(ssn='104-84-7144', first_name='Claudianus', last_name='Brixey', gender='Male', language='Afrikaans')\n"
     ]
    }
   ],
   "source": [
    "from itertools import islice\n",
    "\n",
    "with parsed_data('personal_info.csv') as data:\n",
    "        for row in islice(data, 5):\n",
    "            print(row)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And this should work with our other file too:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Data(car='Chevrolet Chevelle Malibu', mpg='18.0', cylinders='8', displacement='307.0', horsepower='130.0', weight='3504.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Buick Skylark 320', mpg='15.0', cylinders='8', displacement='350.0', horsepower='165.0', weight='3693.', acceleration='11.5', model='70', origin='US')\n",
      "Data(car='Plymouth Satellite', mpg='18.0', cylinders='8', displacement='318.0', horsepower='150.0', weight='3436.', acceleration='11.0', model='70', origin='US')\n",
      "Data(car='AMC Rebel SST', mpg='16.0', cylinders='8', displacement='304.0', horsepower='150.0', weight='3433.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Ford Torino', mpg='17.0', cylinders='8', displacement='302.0', horsepower='140.0', weight='3449.', acceleration='10.5', model='70', origin='US')\n"
     ]
    }
   ],
   "source": [
    "with parsed_data('cars.csv') as data:\n",
    "    for row in islice(data, 5):\n",
    "        print(row)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we can define functions inside other functions in Python, so let's make our context manager more self-contained by integrating both utility functions directly into it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@contextmanager\n",
    "def parsed_data(f_name):\n",
    "    def get_dialect(f_name):\n",
    "        with open(f_name) as f:\n",
    "            return csv.Sniffer().sniff(f.read(1000))\n",
    "    \n",
    "    def parsed_data_iter(data_iter, nt):\n",
    "        for row in data_iter:\n",
    "            yield nt(*row) \n",
    "        \n",
    "    f = open(f_name, 'r')\n",
    "    try:\n",
    "        reader = csv.reader(f, get_dialect(f_name))\n",
    "        headers = map(lambda x: x.lower(), next(reader))\n",
    "        nt = namedtuple('Data', headers)\n",
    "        yield parsed_data_iter(reader, nt)\n",
    "    finally:\n",
    "        f.close()  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Data(car='Chevrolet Chevelle Malibu', mpg='18.0', cylinders='8', displacement='307.0', horsepower='130.0', weight='3504.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Buick Skylark 320', mpg='15.0', cylinders='8', displacement='350.0', horsepower='165.0', weight='3693.', acceleration='11.5', model='70', origin='US')\n",
      "Data(car='Plymouth Satellite', mpg='18.0', cylinders='8', displacement='318.0', horsepower='150.0', weight='3436.', acceleration='11.0', model='70', origin='US')\n",
      "Data(car='AMC Rebel SST', mpg='16.0', cylinders='8', displacement='304.0', horsepower='150.0', weight='3433.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Ford Torino', mpg='17.0', cylinders='8', displacement='302.0', horsepower='140.0', weight='3449.', acceleration='10.5', model='70', origin='US')\n",
      "-------------------\n",
      "Data(ssn='100-53-9824', first_name='Sebastiano', last_name='Tester', gender='Male', language='Icelandic')\n",
      "Data(ssn='101-71-4702', first_name='Cayla', last_name='MacDonagh', gender='Female', language='Lao')\n",
      "Data(ssn='101-84-0356', first_name='Nomi', last_name='Lipprose', gender='Female', language='Yiddish')\n",
      "Data(ssn='104-22-0928', first_name='Justinian', last_name='Kunzelmann', gender='Male', language='Dhivehi')\n",
      "Data(ssn='104-84-7144', first_name='Claudianus', last_name='Brixey', gender='Male', language='Afrikaans')\n",
      "-------------------\n"
     ]
    }
   ],
   "source": [
    "f_names = 'cars.csv', 'personal_info.csv'\n",
    "for f_name in f_names:\n",
    "    with parsed_data(f_name) as data:\n",
    "        for row in islice(data, 5):\n",
    "            print(row)\n",
    "    print('-------------------')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So that works just fine. But do we really need that parsed_data_iter function?\n",
    "\n",
    "Maybe we can rewrite it using a generator expression instead:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@contextmanager\n",
    "def parsed_data(f_name):\n",
    "    def get_dialect(f_name):\n",
    "        with open(f_name) as f:\n",
    "            return csv.Sniffer().sniff(f.read(1000))\n",
    "    f = open(f_name, 'r')\n",
    "    try:\n",
    "        reader = csv.reader(f, get_dialect(f_name))\n",
    "        headers = map(lambda x: x.lower(), next(reader))\n",
    "        nt = namedtuple('Data', headers)\n",
    "        yield (nt(*row) for row in reader)\n",
    "    finally:\n",
    "        f.close()  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Data(car='Chevrolet Chevelle Malibu', mpg='18.0', cylinders='8', displacement='307.0', horsepower='130.0', weight='3504.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Buick Skylark 320', mpg='15.0', cylinders='8', displacement='350.0', horsepower='165.0', weight='3693.', acceleration='11.5', model='70', origin='US')\n",
      "Data(car='Plymouth Satellite', mpg='18.0', cylinders='8', displacement='318.0', horsepower='150.0', weight='3436.', acceleration='11.0', model='70', origin='US')\n",
      "Data(car='AMC Rebel SST', mpg='16.0', cylinders='8', displacement='304.0', horsepower='150.0', weight='3433.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Ford Torino', mpg='17.0', cylinders='8', displacement='302.0', horsepower='140.0', weight='3449.', acceleration='10.5', model='70', origin='US')\n"
     ]
    }
   ],
   "source": [
    "with parsed_data('cars.csv') as data:\n",
    "    for row in islice(data, 5):\n",
    "        print(row)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Good, that's one simplification we could make.\n",
    "\n",
    "How about that `get_dialect` function?\n",
    "Since that function is inside our main function, we really don't have to pass the file name to it - it already has access to it from the outer scope:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@contextmanager\n",
    "def parsed_data(f_name):\n",
    "    def get_dialect():\n",
    "        with open(f_name) as f:\n",
    "            return csv.Sniffer().sniff(f.read(1000))\n",
    "        \n",
    "    f = open(f_name, 'r')\n",
    "    try:\n",
    "        reader = csv.reader(f, get_dialect())\n",
    "        headers = map(lambda x: x.lower(), next(reader))\n",
    "        nt = namedtuple('Data', headers)\n",
    "        yield (nt(*row) for row in reader)\n",
    "    finally:\n",
    "        f.close()  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Data(car='Chevrolet Chevelle Malibu', mpg='18.0', cylinders='8', displacement='307.0', horsepower='130.0', weight='3504.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Buick Skylark 320', mpg='15.0', cylinders='8', displacement='350.0', horsepower='165.0', weight='3693.', acceleration='11.5', model='70', origin='US')\n",
      "Data(car='Plymouth Satellite', mpg='18.0', cylinders='8', displacement='318.0', horsepower='150.0', weight='3436.', acceleration='11.0', model='70', origin='US')\n",
      "Data(car='AMC Rebel SST', mpg='16.0', cylinders='8', displacement='304.0', horsepower='150.0', weight='3433.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Ford Torino', mpg='17.0', cylinders='8', displacement='302.0', horsepower='140.0', weight='3449.', acceleration='10.5', model='70', origin='US')\n"
     ]
    }
   ],
   "source": [
    "with parsed_data('cars.csv') as data:\n",
    "    for row in islice(data, 5):\n",
    "        print(row)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But we really don't have to make it into a function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@contextmanager\n",
    "def parsed_data(f_name):\n",
    "    with open(f_name) as f:\n",
    "        dialect = csv.Sniffer().sniff(f.read(1000))\n",
    "\n",
    "    f = open(f_name, 'r')\n",
    "    try:\n",
    "        reader = csv.reader(f, dialect)\n",
    "        headers = map(lambda x: x.lower(), next(reader))\n",
    "        nt = namedtuple('Data', headers)\n",
    "        yield (nt(*row) for row in reader)\n",
    "    finally:\n",
    "        f.close()  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Data(car='Chevrolet Chevelle Malibu', mpg='18.0', cylinders='8', displacement='307.0', horsepower='130.0', weight='3504.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Buick Skylark 320', mpg='15.0', cylinders='8', displacement='350.0', horsepower='165.0', weight='3693.', acceleration='11.5', model='70', origin='US')\n",
      "Data(car='Plymouth Satellite', mpg='18.0', cylinders='8', displacement='318.0', horsepower='150.0', weight='3436.', acceleration='11.0', model='70', origin='US')\n",
      "Data(car='AMC Rebel SST', mpg='16.0', cylinders='8', displacement='304.0', horsepower='150.0', weight='3433.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Ford Torino', mpg='17.0', cylinders='8', displacement='302.0', horsepower='140.0', weight='3449.', acceleration='10.5', model='70', origin='US')\n"
     ]
    }
   ],
   "source": [
    "with parsed_data('cars.csv') as data:\n",
    "    for row in islice(data, 5):\n",
    "        print(row)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Technically, we don't even have to open that file twice - once to sniff the dialect, the other to iterate over the file.\n",
    "\n",
    "With the file object, we can read some data, and then \"rewind\" - using the `seek` method. This is not the same as using the `iterator` protocol which the file objects also implement.\n",
    "\n",
    "Here's a simple example of how that works:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n",
      "Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;\n",
      "--- 3504.;12.0;70;US\n",
      "Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US\n",
      "Plymouth Satellite;18.0;8;318.0;150.0;3436.;11.0;\n",
      "--- Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n",
      "Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;\n"
     ]
    }
   ],
   "source": [
    "with open('cars.csv') as f:\n",
    "    print('---', f.read(120))\n",
    "    print('---', f.read(120))\n",
    "    f.seek(0)\n",
    "    print('---', f.read(120))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can still iterator over the lines after rewinding:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- Car;MPG;Cylinders;Displacement;Horsepower;Weight;A\n",
      "Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n",
      "Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US\n",
      "Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US\n",
      "Plymouth Satellite;18.0;8;318.0;150.0;3436.;11.0;70;US\n",
      "AMC Rebel SST;16.0;8;304.0;150.0;3433.;12.0;70;US\n"
     ]
    }
   ],
   "source": [
    "with open('cars.csv') as f:\n",
    "    print('---', f.read(50))\n",
    "    f.seek(0)\n",
    "    for row in islice(f, 5):\n",
    "        print(row, end='')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So let's use that to simplify our code further:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@contextmanager\n",
    "def parsed_data(f_name):\n",
    "    f = open(f_name, 'r')\n",
    "    try:\n",
    "        dialect = csv.Sniffer().sniff(f.read(1000))\n",
    "        f.seek(0)\n",
    "        reader = csv.reader(f, dialect)\n",
    "        headers = map(lambda x: x.lower(), next(reader))\n",
    "        nt = namedtuple('Data', headers)\n",
    "        yield (nt(*row) for row in reader)\n",
    "    finally:\n",
    "        f.close()  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Data(car='Chevrolet Chevelle Malibu', mpg='18.0', cylinders='8', displacement='307.0', horsepower='130.0', weight='3504.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Buick Skylark 320', mpg='15.0', cylinders='8', displacement='350.0', horsepower='165.0', weight='3693.', acceleration='11.5', model='70', origin='US')\n",
      "Data(car='Plymouth Satellite', mpg='18.0', cylinders='8', displacement='318.0', horsepower='150.0', weight='3436.', acceleration='11.0', model='70', origin='US')\n",
      "Data(car='AMC Rebel SST', mpg='16.0', cylinders='8', displacement='304.0', horsepower='150.0', weight='3433.', acceleration='12.0', model='70', origin='US')\n",
      "Data(car='Ford Torino', mpg='17.0', cylinders='8', displacement='302.0', horsepower='140.0', weight='3449.', acceleration='10.5', model='70', origin='US')\n",
      "-------------------\n",
      "Data(ssn='100-53-9824', first_name='Sebastiano', last_name='Tester', gender='Male', language='Icelandic')\n",
      "Data(ssn='101-71-4702', first_name='Cayla', last_name='MacDonagh', gender='Female', language='Lao')\n",
      "Data(ssn='101-84-0356', first_name='Nomi', last_name='Lipprose', gender='Female', language='Yiddish')\n",
      "Data(ssn='104-22-0928', first_name='Justinian', last_name='Kunzelmann', gender='Male', language='Dhivehi')\n",
      "Data(ssn='104-84-7144', first_name='Claudianus', last_name='Brixey', gender='Male', language='Afrikaans')\n",
      "-------------------\n"
     ]
    }
   ],
   "source": [
    "f_names = 'cars.csv', 'personal_info.csv'\n",
    "for f_name in f_names:\n",
    "    with parsed_data(f_name) as data:\n",
    "        for row in islice(data, 5):\n",
    "            print(row)\n",
    "    print('-------------------')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
